Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: implement the random feature (alternative version) #129201

Merged
merged 5 commits into from
Sep 23, 2024

Conversation

joboet
Copy link
Member

@joboet joboet commented Aug 17, 2024

Implements the ACP rust-lang/libs-team#393.

This PR is an alternative version of #129120 that replaces getentropy with CCRandomGenerateBytes (on macOS) and arc4random_buf (other BSDs), since that function is not suited for generating large amounts of data and should only be used to seed other CPRNGs. CCRandomGenerateBytes/arc4random_buf on the other hand is (on modern platforms) just as secure and uses its own, very strong CPRNG (ChaCha20 on the BSDs, AES on macOS) periodically seeded with getentropy.

@rustbot
Copy link
Collaborator

rustbot commented Aug 17, 2024

r? @Noratrieb

rustbot has assigned @Noratrieb.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added O-hermit Operating System: Hermit O-SGX Target: SGX O-solid Operating System: SOLID O-unix Operating system: Unix-like O-wasi Operating system: Wasi, Webassembly System Interface O-windows Operating system: Windows S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 17, 2024
@rust-log-analyzer

This comment has been minimized.

@joboet
Copy link
Member Author

joboet commented Aug 18, 2024

I'm nominating this for T-libs:

Do you consider a periodically reseeded, non-RC4-based arc4random_buf secure enough to serve as default random source on the non-Linux UNIX platforms?

@rustbot label +I-libs-nominated

@rustbot rustbot added the I-libs-nominated Nominated for discussion during a libs team meeting. label Aug 18, 2024
@rustbot
Copy link
Collaborator

rustbot commented Aug 18, 2024

The Miri subtree was changed

cc @rust-lang/miri

@RalfJung
Copy link
Member

RalfJung commented Aug 18, 2024

Thanks for adding the Miri shim!

Could you also add a test (in src/tools/miri/tests/pass/shims/) that calls the new public APIs, just to make sure we actually got them covered? (I assume they get called by HashMap as well but it'd be better to also have a direct test, in particular given that best_effort can lead to different codepaths being taken.)

Copy link
Contributor

@BlackHoleFox BlackHoleFox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you consider a periodically reseeded, non-RC4-based arc4random_buf secure enough to serve as default random source on the non-Linux UNIX platforms?

So I'm not t-libs, but familiar with OS randomness, so just going to chime in politely. The documentation for arc4random_buf states that it generates "cryptographic pseudo-random" and pretty much all modern UNIX distributions have moved on from RC4 to something better. For all the targets which use this function, does Rust officially support any OS versions which use weaker CSPRNG functions? I don't think we want the possibility of ever hitting a bad RNG "in the wild", so to speak.

Also related: I think the documentation of the RandomSource trait needs to state more guarantees of implementations. At this time its just too loose to ever let it be recommended over the getrandom crate or rand_core's CryptoRng. Is that intended per the accepted ACP?

@@ -0,0 +1,7 @@
pub fn fill_bytes(bytes: &mut [u8], best_effort: bool) {
if best_effort {
bytes.fill(4); // Chosen by fair dice roll. Guaranteed to be random.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a chance to improve this from the previous (1, 2) is to get a stack or heap pointer address?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Done.

all(target_family = "wasm", target_os = "unknown"),
target_os = "xous",
))] {
// FIXME: finally remove std support for wasm32-unknown-unknown
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a tracking issue to link to this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but perhaps we should open one.

library/std/src/sys/random/linux.rs Outdated Show resolved Hide resolved
library/std/src/sys/random/linux.rs Outdated Show resolved Hide resolved
/// The default random source.
///
/// This asks the system for the best random data it can provide, meaning the
/// resulting bytes *should* be usable for cryptographic purposes. Check your
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a problem to say should here because then its completely unreliable. imo std should guarantee this or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've improved the wording, it should be a bit more clear about our promises. If the platform has some specific requirements to ensure the strength of the RNG, we can't do anything about that, so we definitely should encourage people to consult the platform documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, makes sense. Its outside of std's control to enable all the various kernel configs etc.

@BlackHoleFox
Copy link
Contributor

that replaces getentropy with CCRandomGenerateBytes (on macOS)

It might be worth thinking about this change a bit more? Here's some numbers in case anyone else is curious. Funnily enough it actually appears to be faster to pull large amounts (>= 512) of bytes out of getentropy if you need "random numbers fast" because the overhead of the multithreading and reseeding CCRandomGenerateBytes needs to manage catches up.

rngs                fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ with_ccrandom                  │               │               │               │         │
│  ├─ 8             71.95 ns      │ 81.08 ns      │ 72.61 ns      │ 73.51 ns      │ 100     │ 6400
│  ├─ 16            239.2 ns      │ 270.5 ns      │ 241.8 ns      │ 243 ns        │ 100     │ 3200
│  ├─ 32            243.1 ns      │ 270.5 ns      │ 244.5 ns      │ 246.2 ns      │ 100     │ 3200
│  ├─ 64            247 ns        │ 271.8 ns      │ 249.6 ns      │ 251 ns        │ 100     │ 3200
│  ├─ 128           248.3 ns      │ 280.9 ns      │ 258.8 ns      │ 258.5 ns      │ 100     │ 3200
│  ├─ 256           291.3 ns      │ 327.8 ns      │ 294 ns        │ 296.9 ns      │ 100     │ 1600
│  ├─ 512           361.6 ns      │ 400.6 ns      │ 369.4 ns      │ 370.3 ns      │ 100     │ 1600
│  ├─ 1024          489.1 ns      │ 546.5 ns      │ 499.6 ns      │ 502.2 ns      │ 100     │ 800
│  ╰─ 2048          749.6 ns      │ 812.1 ns      │ 760 ns        │ 765.9 ns      │ 100     │ 800
╰─ with_getentropy                │               │               │               │         │
   ├─ 8             267.8 ns      │ 551.8 ns      │ 286.1 ns      │ 289.7 ns      │ 100     │ 1600
   ├─ 16            405.9 ns      │ 926.7 ns      │ 432 ns        │ 444.7 ns      │ 100     │ 1600
   ├─ 32            398.1 ns      │ 585.6 ns      │ 409.8 ns      │ 421.2 ns      │ 100     │ 1600
   ├─ 64            405.9 ns      │ 674.1 ns      │ 415 ns        │ 424.5 ns      │ 100     │ 1600
   ├─ 128           411.1 ns      │ 1.241 µs      │ 433.2 ns      │ 447.7 ns      │ 100     │ 1600
   ├─ 256           455.4 ns      │ 549.1 ns      │ 473.6 ns      │ 479.1 ns      │ 100     │ 1600
   ├─ 512           150.7 ns      │ 165 ns        │ 153.3 ns      │ 154 ns        │ 100     │ 3200
   ├─ 1024          150.7 ns      │ 165 ns        │ 153.3 ns      │ 153.9 ns      │ 100     │ 3200
   ╰─ 2048          150.7 ns      │ 165 ns        │ 153.3 ns      │ 154 ns        │ 100     │ 3200

@joboet
Copy link
Member Author

joboet commented Aug 19, 2024

that replaces getentropy with CCRandomGenerateBytes (on macOS)

It might be worth thinking about this change a bit more? Here's some numbers in case anyone else is curious. Funnily enough it actually appears to be faster to pull large amounts (>= 512) of bytes out of getentropy if you need "random numbers fast" because the overhead of the multithreading and reseeding CCRandomGenerateBytes needs to manage catches up.

rngs                fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ with_ccrandom                  │               │               │               │         │
│  ├─ 8             71.95 ns      │ 81.08 ns      │ 72.61 ns      │ 73.51 ns      │ 100     │ 6400
│  ├─ 16            239.2 ns      │ 270.5 ns      │ 241.8 ns      │ 243 ns        │ 100     │ 3200
│  ├─ 32            243.1 ns      │ 270.5 ns      │ 244.5 ns      │ 246.2 ns      │ 100     │ 3200
│  ├─ 64            247 ns        │ 271.8 ns      │ 249.6 ns      │ 251 ns        │ 100     │ 3200
│  ├─ 128           248.3 ns      │ 280.9 ns      │ 258.8 ns      │ 258.5 ns      │ 100     │ 3200
│  ├─ 256           291.3 ns      │ 327.8 ns      │ 294 ns        │ 296.9 ns      │ 100     │ 1600
│  ├─ 512           361.6 ns      │ 400.6 ns      │ 369.4 ns      │ 370.3 ns      │ 100     │ 1600
│  ├─ 1024          489.1 ns      │ 546.5 ns      │ 499.6 ns      │ 502.2 ns      │ 100     │ 800
│  ╰─ 2048          749.6 ns      │ 812.1 ns      │ 760 ns        │ 765.9 ns      │ 100     │ 800
╰─ with_getentropy                │               │               │               │         │
   ├─ 8             267.8 ns      │ 551.8 ns      │ 286.1 ns      │ 289.7 ns      │ 100     │ 1600
   ├─ 16            405.9 ns      │ 926.7 ns      │ 432 ns        │ 444.7 ns      │ 100     │ 1600
   ├─ 32            398.1 ns      │ 585.6 ns      │ 409.8 ns      │ 421.2 ns      │ 100     │ 1600
   ├─ 64            405.9 ns      │ 674.1 ns      │ 415 ns        │ 424.5 ns      │ 100     │ 1600
   ├─ 128           411.1 ns      │ 1.241 µs      │ 433.2 ns      │ 447.7 ns      │ 100     │ 1600
   ├─ 256           455.4 ns      │ 549.1 ns      │ 473.6 ns      │ 479.1 ns      │ 100     │ 1600
   ├─ 512           150.7 ns      │ 165 ns        │ 153.3 ns      │ 154 ns        │ 100     │ 3200
   ├─ 1024          150.7 ns      │ 165 ns        │ 153.3 ns      │ 153.9 ns      │ 100     │ 3200
   ╰─ 2048          150.7 ns      │ 165 ns        │ 153.3 ns      │ 154 ns        │ 100     │ 3200

Something is wrong with your benchmark code, getentropy takes the same time for 512, 1024 and 2048 bytes and that time is lower than getentropy for 8 bytes. Those numbers don't add up.

Edit: The issue is that getentropy only works for up to 256 bytes. Anything larger than that will generate an error, so you're benchmarking the time it takes for the error to be detected.

@joshtriplett
Copy link
Member

joshtriplett commented Aug 21, 2024

@BlackHoleFox

At this time its just too loose to ever let it be recommended over the getrandom crate or rand_core's CryptoRng. Is that intended per the accepted ACP?

That is not intended; it should ideally be at least as good as getrandom.

@joboet
Copy link
Member Author

joboet commented Aug 21, 2024

I'll improve the wording, this is just as good as getrandom (I am of the opinion that arc4random_buf is not weaker than getentropy). The documentation shouldn't however make false promises: the strength of the random numbers is platform dependent, and some platforms do not guarantee cryptographic strength in all circumstances (e.g. the strength of NetBSD's random generator depends on the system's configuration and ESP-IDF requires Bluetooth or Wi-Fi to be enabled for cryptographic strength). That's why people should ensure that they consult their target platform's documentation for the specific guarantees it provides.

@Noratrieb Noratrieb added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 21, 2024
@the8472
Copy link
Member

the8472 commented Aug 21, 2024

I'm nominating this for T-libs:

Do you consider a periodically reseeded, non-RC4-based arc4random_buf secure enough to serve as default random source on the non-Linux UNIX platforms?

We discussed this during today's T-libs meeting and did not manage to reach consensus because the tradeoffs involved in what we want to promise.

On BSDs where arc4random is implemented as a reseeded userspace CSPRNG that's properly wiped on fork it is probably fine as a fast RNG source that would even be appropriate for cryptographic uses in most scenarios. The issue is that there still are edge-cases where it's better to go directly to the kernel and have it tap hardware RNGs if those are available, e.g. to avoid state reuse when VMs get cloned.
That then lead to a discussion that we may need two sources, one that promises low latency and one that errs on the side of security even if that costs performance. But that's T-libs-api turf.

@bors

This comment was marked as resolved.

@joboet
Copy link
Member Author

joboet commented Aug 22, 2024

VM cloning is a good point! I'm not sure if providing two random sources is a good idea though, if we advertise the default as being "cryptographically secure" then people are going to use it as such without considering edge-cases. But that's a question for libs-api, as you said.

I'm not sure that the choice between arc4random and getentropy matters in regard to VM clones, however. For one, there are ways to protect userspace RNGs against it, e.g. the vDSO getrandom call uses a generation counter updated by the kernel to know when reseeding becomes necessary. And secondly, for the systems that don't do so, switching to getentropy won't help us, because they don't detect VM clones at all (at least, they don't use the VM_Gen_Counter ACPI device used by e.q. QEMU to signal VM cloning). FreeBSD is the only platform I could find where getentropy actually protects against VM forks while arc4random doesn't. I think we should contact the system maintainers here, make them aware of VM forks and ask them to also remember fixing arc4random when protecting against them.

@joboet joboet force-pushed the random_faster_sources branch from c02bc9e to 228bd6d Compare August 22, 2024 13:53
@joboet
Copy link
Member Author

joboet commented Aug 22, 2024

Rebased to reintroduce hashmap_random_keys, fix a Linux poll bug (a timeout of zero represents immediate timeout, not infinite as I though), use allocation addresses for weak random data on the unsupported platforms and add the miri test.

@joboet joboet force-pushed the random_faster_sources branch from 228bd6d to 6e78b60 Compare August 22, 2024 15:36
@joboet
Copy link
Member Author

joboet commented Aug 24, 2024

I'm filing issues with some of the respective platforms, here's the list so far:

OS issue status
FreeBSD https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281032
NetBSD kern/58632 (not public) resolved in HEAD, waiting for pullup
OpenBSD (not public)

@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Sep 22, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Sep 23, 2024
…shtriplett

std: implement the `random` feature (alternative version)

Implements the ACP rust-lang/libs-team#393.

This PR is an alternative version of rust-lang#129120 that replaces `getentropy` with `CCRandomGenerateBytes` (on macOS) and `arc4random_buf` (other BSDs), since that function is not suited for generating large amounts of data and should only be used to seed other CPRNGs. `CCRandomGenerateBytes`/`arc4random_buf` on the other hand is (on modern platforms) just as secure and uses its own, very strong CPRNG (ChaCha20 on the BSDs, AES on macOS) periodically seeded with `getentropy`.
@bors
Copy link
Contributor

bors commented Sep 23, 2024

⌛ Testing commit f95e4c9 with merge 9723fa6...

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 23, 2024
…triplett

std: implement the `random` feature (alternative version)

Implements the ACP rust-lang/libs-team#393.

This PR is an alternative version of rust-lang#129120 that replaces `getentropy` with `CCRandomGenerateBytes` (on macOS) and `arc4random_buf` (other BSDs), since that function is not suited for generating large amounts of data and should only be used to seed other CPRNGs. `CCRandomGenerateBytes`/`arc4random_buf` on the other hand is (on modern platforms) just as secure and uses its own, very strong CPRNG (ChaCha20 on the BSDs, AES on macOS) periodically seeded with `getentropy`.
@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Sep 23, 2024

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Sep 23, 2024
@joboet joboet force-pushed the random_faster_sources branch from f95e4c9 to 3ff09a0 Compare September 23, 2024 08:37
@joboet
Copy link
Member Author

joboet commented Sep 23, 2024

I needed to reformat with the new 2024 rules.
@bors r=@joshtriplett

@bors
Copy link
Contributor

bors commented Sep 23, 2024

📌 Commit 3ff09a0 has been approved by joshtriplett

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 23, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Sep 23, 2024
…shtriplett

std: implement the `random` feature (alternative version)

Implements the ACP rust-lang/libs-team#393.

This PR is an alternative version of rust-lang#129120 that replaces `getentropy` with `CCRandomGenerateBytes` (on macOS) and `arc4random_buf` (other BSDs), since that function is not suited for generating large amounts of data and should only be used to seed other CPRNGs. `CCRandomGenerateBytes`/`arc4random_buf` on the other hand is (on modern platforms) just as secure and uses its own, very strong CPRNG (ChaCha20 on the BSDs, AES on macOS) periodically seeded with `getentropy`.
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 23, 2024
…kingjubilee

Rollup of 7 pull requests

Successful merges:

 - rust-lang#129201 (std: implement the `random` feature (alternative version))
 - rust-lang#130536 (bootstrap: Set the dylib path when building books with rustdoc)
 - rust-lang#130551 (Fix `break_last_token`.)
 - rust-lang#130657 (Remove x86_64-fuchsia and aarch64-fuchsia target aliases)
 - rust-lang#130721 (Add more test cases for block-no-opening-brace)
 - rust-lang#130736 (Add rustfmt 2024 reformatting to git blame ignore)
 - rust-lang#130746 (readd `@tgross35` and `@joboet` to the review rotation)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 1d36931 into rust-lang:master Sep 23, 2024
6 checks passed
@rustbot rustbot added this to the 1.83.0 milestone Sep 23, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Sep 23, 2024
Rollup merge of rust-lang#129201 - joboet:random_faster_sources, r=joshtriplett

std: implement the `random` feature (alternative version)

Implements the ACP rust-lang/libs-team#393.

This PR is an alternative version of rust-lang#129120 that replaces `getentropy` with `CCRandomGenerateBytes` (on macOS) and `arc4random_buf` (other BSDs), since that function is not suited for generating large amounts of data and should only be used to seed other CPRNGs. `CCRandomGenerateBytes`/`arc4random_buf` on the other hand is (on modern platforms) just as secure and uses its own, very strong CPRNG (ChaCha20 on the BSDs, AES on macOS) periodically seeded with `getentropy`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-hermit Operating System: Hermit O-SGX Target: SGX O-solid Operating System: SOLID O-unix Operating system: Unix-like O-wasi Operating system: Wasi, Webassembly System Interface O-windows Operating system: Windows S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants